A 12th-grade student, Adi Singh, has developed Minecraft Benchmark (MC-Bench), a website that pits AI models against each other in Minecraft build-offs and lets users vote on the better creation. The twist? Voters don’t know which AI built what until after they vote.
Minecraft, the best-selling video game of all time, provides a visual, intuitive way to assess AI capabilities. Even those unfamiliar with the game can compare AI-generated structures, making AI progress easier to grasp.
🟢 AI vs AI – Models from Anthropic, Google, OpenAI, and Alibaba compete in head-to-head build-offs.
🟢 Crowdsourced Judgment – Users vote on builds before knowing which AI made them.
🟢 Beyond Simple Builds – Singh envisions expanding into complex, goal-oriented AI tasks.
Traditional AI benchmarks often struggle to capture real-world intelligence, so researchers are turning to games like Pokémon Red, Street Fighter, and Pictionary to test AI reasoning in controlled, creative environments.
With MC-Bench, Singh believes Minecraft could become a key AI testing ground, offering a fun, interactive way to evaluate AI progress.